Abstract. We give a short proof of an extension of the free Talagrand
transportation cost inequality to the semicircular law, which was originally
proved in [1]. The proof is based on a convexity argument and is in the spirit
of Talagrand's original approach for the classical counterpart from [8]. We
also discuss the convergence, fluctuations and large deviations of the energy
of the eigenvalues of $\beta$ ensembles which, as an application of the
Talagrand inequality, gives in particular yet another proof of the convergence
of the eigenvalue distribution to the semicircle law.



1. Introduction 

In [8], Talagrand proves the transportation cost inequality to the Gaussian
measure. The one dimensional version for the Gaussian measure
$\gamma(dx) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\,dx$ reads as

(1.1)    $(W_2(\nu,\gamma))^2 \le 2H(\nu\,|\,\gamma),$

where $W_2(\nu,\gamma)$ is the Wasserstein distance defined below by (2.2) and
the relative entropy is

$H(\nu\,|\,\gamma) = \int f(x)\log(f(x))\,\gamma(dx)$ if $\nu(dx) = f(x)\gamma(dx)$,
$H(\nu\,|\,\gamma) = \infty$ if $\nu$ is singular to $\gamma$.
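As a quick sanity check of (1.1), one can take $\nu = N(m, s^2)$, for which
both sides have closed forms: $(W_2(\nu,\gamma))^2 = m^2 + (s-1)^2$ and
$H(\nu\,|\,\gamma) = (m^2 + s^2 - 1)/2 - \log s$. These are standard facts about
one dimensional Gaussians rather than formulas taken from this note, and the
function names below are purely illustrative.

```python
import math


def w2_squared_gaussian(m: float, s: float) -> float:
    """Squared W_2 distance between N(m, s^2) and the standard Gaussian."""
    return m * m + (s - 1.0) ** 2


def relative_entropy_gaussian(m: float, s: float) -> float:
    """Relative entropy H(N(m, s^2) | N(0, 1)), valid for s > 0."""
    return 0.5 * (m * m + s * s - 1.0) - math.log(s)


def talagrand_gap(m: float, s: float) -> float:
    """Right side minus left side of (1.1); the inequality says this is >= 0."""
    return 2.0 * relative_entropy_gaussian(m, s) - w2_squared_gaussian(m, s)


if __name__ == "__main__":
    for m in (-2.0, 0.0, 1.5):
        for s in (0.25, 1.0, 3.0):
            print(f"m={m:+.2f}  s={s:.2f}  gap={talagrand_gap(m, s):.6f}")
```

Algebraically the gap equals $2(s - 1 - \log s)$, so in this family equality in
(1.1) holds exactly for the translates of $\gamma$ (the case $s = 1$).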



In the context of free probability, Biane and Voiculescu proved in [1] a free
version of this:

(1.2)    $(W_2(\mu,\sigma))^2 \le 2(E(\mu) - E(\sigma)),$

where $E(\mu) = \frac{1}{2}\int x^2\,\mu(dx) - \iint \log(|x-y|)\,\mu(dx)\,\mu(dy)$
is the free energy of $\mu$ and
$\sigma(dx) = \frac{1}{2\pi}\mathbf{1}_{[-2,2]}(x)\sqrt{4-x^2}\,dx$ is the
semicircular law, the minimizer of $E(\mu)$ over all probability measures on
the real line. The role of the relative entropy is played here by the
difference between the free energy of $\mu$ and that of the semicircular law.

Using random matrix approximations, Hiai, Petz and Ueda proved in [7] the
following extension of (1.2):

(1.3)    $\rho\,(W_2(\mu,\mu_Q))^2 \le E^Q(\mu) - E^Q(\mu_Q),$

where $\rho > 0$ and $Q : \mathbb{R} \to \mathbb{R}$ is a function so that
$Q(x) - \rho x^2$ is convex and

$E^Q(\mu) = \int Q(x)\,\mu(dx) - \iint \log|x-y|\,\mu(dx)\,\mu(dy).$

Here $\mu_Q$ is the minimizer of $E^Q$ on the set of all probability measures
on the real line. They also prove a version of this for measures supported on
the circle $\mathbb{T}$:

(1.4)    $(\rho + 1/4)(W_2(\nu,\nu_Q))^2 \le E^Q(\nu) - E^Q(\nu_Q),$

where $Q : \mathbb{T} \to \mathbb{R}$ is such that $Q(e^{ix}) - \rho x^2$ is
convex on $\mathbb{R}$, $\rho > -1/4$ and $\nu_Q$ is the minimizer of the
functional $E^Q$ on probability measures on the unit circle $\mathbb{T}$.

Another proof of (1.2) is given in [5] via a Brunn-Minkowski inequality for
free probability.

The primary purpose of this note is to give an elementary proof of (1.3) and
(1.4) in the spirit of Talagrand's proof of (1.1). The idea is to exploit the
convexity of $-\log$ appearing in $E^Q$. We also discuss (see Theorem 2.16 and
Proposition 2.20) the discrete version of the transportation cost inequalities
and some consequences involving Fekete points.

The second purpose of this note is to discuss the energy of the eigenvalues of
$\beta$ ensembles and in particular the fluctuations and the deviations from
the minimum energy (see Theorem 3.1). This is a simple application of Selberg's
formula together with elementary estimates on $\Gamma$ functions. As a
consequence, using the results in the first part, we reprove that the
distribution of the eigenvalues converges almost surely to the semicircular
law.

2. Talagrand Inequalities 

The following result is elementary, but it is the key to our problem.

Lemma 2.1. Let $f : [0,1] \to \mathbb{R}$ be a convex function with the
property that $f(0) = 0$ and there exists $a > 0$ so that

$f(t) \ge -a t^2$ for $t \in [0,1]$.

Then

$f(t) \ge 0$ for all $t \in [0,1]$.

Proof. It follows from the assumptions that for any $\epsilon > 0$, if
$\delta_\epsilon = \min(1, \epsilon/a)$, then $f(t) \ge -t\epsilon$ for
$t \in [0, \delta_\epsilon]$. Now, since $f$ is convex and $f(0) = 0$, one gets
$f(mt) \ge m f(t) \ge -mt\epsilon$ for any integer $m$ with $mt \le 1$, and
therefore $f(t) \ge -\epsilon t$ for any $t \in [0,1]$. Since this is true for
any $\epsilon > 0$, we get $f(t) \ge 0$ for any $t \in [0,1]$. □

In the following, $\mathcal{P}(\Omega)$ denotes the set of all probability
measures on $\Omega$, and for two probability measures $\mu$, $\nu$ with finite
second moment in $\mathcal{P}(\mathbb{R})$ or $\mathcal{P}(\mathbb{T})$, where
$\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$, we define $W_2(\mu,\nu)$, the
Wasserstein distance, by

(2.2)    $W_2(\mu,\nu) = \sqrt{\inf_{\pi\in\Pi(\mu,\nu)} \iint |x-y|^2\,\pi(dx,dy)}.$

Here $\Pi(\mu,\nu)$ is the set of probability measures on $\mathbb{R}^2$ with
marginal distributions $\mu$ and $\nu$, and it can be shown that there is at
least one solution $\pi \in \Pi(\mu,\nu)$ to this minimization problem.

If $\mu$ and $\nu$ are two measures on $\mathbb{R}$ with $F$ and $G$ their
cumulative distribution functions (i.e. $F(x) = \mu((-\infty, x])$), then
Theorem 2.18 in [9] states that

(2.3)    $(W_2(\mu,\nu))^2 = \int_0^1 |F^{-1}(t) - G^{-1}(t)|^2\,dt,$

where $F^{-1}$ denotes the generalized inverse of $F$.
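For two empirical measures with the same number of atoms, the quantile formula
(2.3) becomes an average over sorted samples, because both generalized inverses
are step functions that are constant on each interval $((k-1)/n, k/n]$. The
following is a minimal sketch of that special case; the function name is
illustrative and the equal-size restriction is an assumption of the sketch, not
of (2.3).

```python
from typing import Sequence


def w2_squared_empirical(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Squared W_2 distance between the empirical measures of xs and ys.

    Assumes len(xs) == len(ys) == n, so that the integral over (0, 1) of
    |F^{-1}(t) - G^{-1}(t)|^2 splits into n pieces of length 1/n on which
    both quantile functions are constant.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("need two nonempty samples of equal size")
    n = len(xs)
    # Pair the k-th smallest point of xs with the k-th smallest point of ys.
    return sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys))) / n


if __name__ == "__main__":
    print(w2_squared_empirical([0.0, 1.0], [1.0, 2.0]))
```

The monotone pairing used here is the discrete counterpart of the map
$F^{-1} \circ G$, which is why no optimization over couplings is needed in one
dimension.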

Theorem 2.4. Let $Q : \mathbb{R} \to \mathbb{R}$ be a function so that
$Q(x) - \rho x^2$ is convex for a certain $\rho > 0$. If $\mu_Q$ is a solution
to the minimization problem

(2.5)    $I^Q := \inf_{\mu\in\mathcal{P}(\mathbb{R})} E^Q(\mu),$

where

(2.6)    $E^Q(\mu) = \int Q(x)\,\mu(dx) - \iint \log|x-y|\,\mu(dx)\,\mu(dy),$

then for any $\mu \in \mathcal{P}(\mathbb{R})$, we have

(2.7)    $\rho\,(W_2(\mu,\mu_Q))^2 \le E^Q(\mu) - I^Q.$

In particular, the minimization problem (2.5) has a unique solution.
Proof. There exist constants $c_1$ and $c_2$ so that

$Q(x) - \rho x^2 \ge c_1$ and $-\log(|x-y|) \ge -\frac{\rho}{4}(x^2+y^2) + c_2.$

Then for a certain $C$, we get that

(2.8)    $\frac{1}{2}(Q(x)+Q(y)) - \log(|x-y|) \ge \frac{\rho}{4}(x^2+y^2) + C \ge C,$

and this in turn implies that the infimum in (2.5) is finite (since $E^Q(\mu)$
is finite for $\mu$ the uniform distribution on $[0,1]$) and in particular
$\int Q(x)\,\mu_Q(dx)$ and $\iint \log|x-y|\,\mu_Q(dx)\,\mu_Q(dy)$ are finite,
which means that $\mu_Q$ has finite second moment and no atoms.

Since $E^Q(\mu) > -\infty$, we may assume that $E^Q(\mu)$ is finite, otherwise
there is nothing to prove. Then $\iint \log|x-y|\,\mu(dx)\,\mu(dy)$ and
$\int Q(x)\,\mu(dx)$ are finite. In particular, $\mu$ has finite second moment
and no atoms.

Taking $F$ and $F_Q$, the cumulative distributions of $\mu$, $\mu_Q$, and
$F^{-1}$, $F_Q^{-1}$ their generalized inverses, set
$\theta(x) = F^{-1}(F_Q(x))$. According to [9, Theorem 2.18] and the discussion
following thereafter, the minimizing measure $\pi$ from (2.2) is the
distribution of $x \to (x, \theta(x))$ under $\mu_Q$. In this case, the
inequality we want to prove becomes

$\rho\int |x-\theta(x)|^2\,\mu_Q(dx) \le \int Q(x)\,\mu(dx) - \iint \log|x-y|\,\mu(dx)\,\mu(dy) - I^Q.$

Let $f : [0,1] \to \mathbb{R}$ be given by

$f(t) = -\rho t^2\int |\theta(x)-x|^2\,\mu_Q(dx) + \int Q(t\theta(x)+(1-t)x)\,\mu_Q(dx)$
$\qquad - \iint \log(|t(\theta(x)-\theta(y)) + (1-t)(x-y)|)\,\mu_Q(dx)\,\mu_Q(dy) - I^Q.$

Notice here that $f$ is well defined. Indeed, $Q$ is convex, hence bounded
below, and because $\int Q(\theta(x))\,\mu_Q(dx) = \int Q(x)\,\mu(dx)$ and
$\int Q(x)\,\mu_Q(dx)$ are both finite, one concludes that
$\int Q(t\theta(x)+(1-t)x)\,\mu_Q(dx)$ is finite too. On the other hand, there
is a $C > 0$ so that for any $t \in [0,1]$,

$-\log(|t(\theta(x)-\theta(y)) + (1-t)(x-y)|) \ge -C(\theta(x)^2+\theta(y)^2+x^2+y^2) - C,$

which, combined with the finiteness of the second moments of $\mu$ and
$\mu_Q$, results in (for a constant $C$)

$-\iint \log(|t(\theta(x)-\theta(y)) + (1-t)(x-y)|)\,\mu_Q(dx)\,\mu_Q(dy) \ge C$ for all $t \in [0,1]$.
Now, since $\theta$ is a nondecreasing function and $\mu_Q$ has no atoms, we
can write

$\iint \log(|t(\theta(x)-\theta(y)) + (1-t)(x-y)|)\,\mu_Q(dx)\,\mu_Q(dy)$
$\qquad = 2\iint_{x>y} \log(t(\theta(x)-\theta(y)) + (1-t)(x-y))\,\mu_Q(dx)\,\mu_Q(dy),$

which, combined with the convexity of $-\log$ on $(0,\infty)$ and the
finiteness of $\iint \log|x-y|\,\mu_Q(dx)\,\mu_Q(dy)$ and
$\iint \log|x-y|\,\mu(dx)\,\mu(dy)$, yields the fact that

(**)    $t \to -\iint \log(|t(\theta(x)-\theta(y)) + (1-t)(x-y)|)\,\mu_Q(dx)\,\mu_Q(dy)$

is well defined and convex.

The inequality (2.7) is now equivalent to $f(1) \ge 0$. To show this, we apply
Lemma 2.1. The convexity follows easily from the convexity of
$Q(x) - \rho x^2$ and (**). Now if $\nu_t$ is the distribution of
$x \to t\theta(x) + (1-t)x$ under $\mu_Q$, then the minimization property of
$\mu_Q$ implies that $E^Q(\nu_t) \ge I^Q$, that is,

$f(t) \ge -\rho t^2\int |\theta(x)-x|^2\,\mu_Q(dx)$ for $t \in [0,1]$,

and then Lemma 2.1 shows that $f(t) \ge 0$ for any $t \in [0,1]$.

The existence statement follows from the lower semicontinuity of $E^Q$. For a
proof of the existence and compactness of the support of $\mu_Q$, see for
instance Chapter 6 in [2]. □

Remark 2.9. What was essential during the proof was the convexity of $-\log$
on $(0,\infty)$ and the fact that for any $a > 0$, there is a $C(a)$ so that
$-\log|x-y| \ge -a(x^2+y^2) + C(a)$. Therefore if we replace the log in the
statement of this theorem by any kernel $K(|x-y|)$ with the property that $K$
is concave on $(0,\infty)$ and that for any $a > 0$, there is a $C(a)$ so that
$-K(|x-y|) \ge -a(x^2+y^2) + C(a)$, then the result still holds. Other examples
of such kernels are $-1/x^\alpha$, $\alpha > 0$, and $-1/\log(x^2+1)$.

If we take $Q(x) = x^2/2$ and keep in mind that the minimizing measure $\mu_Q$
for $E^Q$ is the semicircular law, one gets the following result proved in [1].

Corollary 2.10. Let $\sigma(dx) = \frac{1}{2\pi}\mathbf{1}_{[-2,2]}(x)\sqrt{4-x^2}\,dx$
be the semicircular law on $[-2,2]$. Then for any $\mu \in \mathcal{P}(\mathbb{R})$,

$\frac{1}{2}(W_2(\mu,\sigma))^2 \le \frac{1}{2}\int x^2\,\mu(dx) - \iint \log(|x-y|)\,\mu(dx)\,\mu(dy) - \frac{3}{4}.$

The next theorem is just inequality (1.4).

Theorem 2.11. Assume $Q : \mathbb{T} \to \mathbb{R}$ is a function so that
$Q(e^{ix}) - \rho x^2$ is convex on $\mathbb{R}$ for a given $\rho > -1/4$. If
$\nu_Q$ is a solution to the minimization problem

(2.12)    $I^Q := \inf_{\nu\in\mathcal{P}(\mathbb{T})} E^Q(\nu),$

where

(2.13)    $E^Q(\nu) = \int_{\mathbb{T}} Q(z)\,\nu(dz) - \iint_{\mathbb{T}\times\mathbb{T}} \log|z-z'|\,\nu(dz)\,\nu(dz'),$

then, for any $\nu \in \mathcal{P}(\mathbb{T})$ we have

(2.14)    $(\rho + 1/4)(W_2(\nu,\nu_Q))^2 \le E^Q(\nu) - I^Q.$

In particular, there is a unique solution for the minimization problem (2.12).
Proof. Take the exponential map
$\exp : x \in \mathbb{R} \to e^{ix} \in \mathbb{T}$ and for any measure $\mu$
on $\mathbb{T}$, define
$\tilde\mu(A) = \sum_{n\in\mathbb{Z}} \mu(\exp(A \cap [2\pi n, 2\pi(n+1))))$.
One can show that there exists $L \in [0, 2\pi)$ such that the restrictions of
$\tilde\nu$ and $\tilde\nu_Q$ to $[L, L+2\pi)$ have the same mean value.

We then identify $[L, L+2\pi)$ with $\mathbb{T}$ via the exponential map and,
keeping the same notation, take $\nu_Q$ and $\nu$ to be the restrictions of
$\tilde\nu_Q$ and $\tilde\nu$ to the interval $[L, L+2\pi)$. We then follow the
proof of Theorem 2.4 with the necessary adjustments. We take the function
$f(t)$ here to be

$f(t) = -(\rho+1/4)t^2\int |\theta(x)-x|^2\,\nu_Q(dx) + \int Q(e^{i(t\theta(x)+(1-t)x)})\,\nu_Q(dx)$
$\qquad - \iint \log(|e^{i(t\theta(x)+(1-t)x)} - e^{i(t\theta(y)+(1-t)y)}|)\,\nu_Q(dx)\,\nu_Q(dy) - I^Q.$
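Now, $|e^{ia} - e^{ib}|^2 = 4\sin^2((a-b)/2)$ for $a$, $b$ real numbers and,
since $\nu$ and $\nu_Q$ have the same mean value,

$\int |\theta(x)-x|^2\,\nu_Q(dx) = \frac{1}{2}\iint \big((\theta(x)-x) - (\theta(y)-y)\big)^2\,\nu_Q(dx)\,\nu_Q(dy).$

Next, set $\theta_t(x) = t\theta(x) + (1-t)x$ and notice that

$g(t) := -\frac{t^2}{4}\int |\theta(x)-x|^2\,\nu_Q(dx) - \iint \log(|e^{i\theta_t(x)} - e^{i\theta_t(y)}|)\,\nu_Q(dx)\,\nu_Q(dy)$
$\quad = -\frac{t^2}{8}\iint \big((\theta(x)-x) - (\theta(y)-y)\big)^2\,\nu_Q(dx)\,\nu_Q(dy) - \iint \log|2\sin((\theta_t(x)-\theta_t(y))/2)|\,\nu_Q(dx)\,\nu_Q(dy)$
$\quad = -\frac{t^2}{4}\iint_{x>y} \big((\theta(x)-x) - (\theta(y)-y)\big)^2\,\nu_Q(dx)\,\nu_Q(dy) - 2\iint_{x>y} \log\big(2\sin((\theta_t(x)-\theta_t(y))/2)\big)\,\nu_Q(dx)\,\nu_Q(dy),$

where in the last line we used the fact that $\theta$ is a nondecreasing
function. Since $x, y, \theta(x), \theta(y) \in [-\pi, \pi)$, for $x > y$ we
have $0 \le \theta_t(x) - \theta_t(y) < 2\pi$, and the function
$u \to -\frac{1}{8}u^2 - \log(\sin(u/2))$ is convex on $(0, 2\pi)$ because its
second derivative is $-\frac{1}{4} + \frac{1}{4\sin^2(u/2)} \ge 0$,
which implies that the function $g$ is convex on $[0,1]$. This, coupled with
the convexity of $Q(e^{ix}) - \rho x^2$, shows that $f$ is a convex function.
Finally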

$f(t) \ge -(\rho + 1/4)t^2\int |\theta(x)-x|^2\,\nu_Q(dx),$

and thus Lemma 2.1 shows that $f(1) \ge 0$, which is (2.14).

The existence of a minimizer follows from the fact that $E^Q$ is lower
semicontinuous. □

For $Q = 0$ and $\rho = 0$, the minimizer of (2.12) is the Haar measure on
$\mathbb{T}$. One can check this by showing directly that the uniform measure
satisfies the variational form of (2.12).
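Corollary 2.15. For any $\mu \in \mathcal{P}(\mathbb{T})$,

$\frac{1}{4}(W_2(\mu,\lambda))^2 \le -\iint_{\mathbb{T}\times\mathbb{T}} \log|z-z'|\,\mu(dz)\,\mu(dz'),$

where $\lambda$ is the Haar measure on $\mathbb{T}$.

Using the same argument as in the proof of Theorem 2.4 we can also prove a
discrete version of it.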

Theorem 2.16. Let $Q : \mathbb{R} \to \mathbb{R}$ be a function so that
$Q(x) - \rho x^2$ is convex for a certain $\rho > 0$. For
$x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$, set the energy of $x$ to be
given by

$E_n^Q(x) = \frac{1}{n}\sum_{k=1}^n Q(x_k) - \frac{2}{n(n-1)}\sum_{1\le i<j\le n} \log|x_i - x_j|.$

If $\Lambda_n^Q = E_n^Q(y) = \inf\{E_n^Q(x) : x \in \mathbb{R}^n\}$, then for
any $x \in \mathbb{R}^n$,

(2.17)    $\rho\,(W_2(\mu(x),\mu(y)))^2 \le E_n^Q(x) - E_n^Q(y) = E_n^Q(x) - \Lambda_n^Q,$

where $\mu(x) = \frac{1}{n}\sum_{k=1}^n \delta_{x_k}$. Moreover,

(2.18)    $\Lambda_n^Q \le \Lambda_{n+1}^Q.$

The only statement that needs to be clarified here is (2.18). If $y_{n+1}$ is a
minimum point for $E_{n+1}^Q$ and $y_{n+1}^i$ denotes the $n$ dimensional
vector obtained from $y_{n+1}$ by removing the $i$th component, then
$\Lambda_{n+1}^Q = \frac{1}{n+1}\sum_{i=1}^{n+1} E_n^Q(y_{n+1}^i)$, which is
obviously $\ge \Lambda_n^Q$.

The minimum points of $E_n^Q$ are called Fekete points in the literature. It is
known (see for instance chapter 6 in [2]) that
$\lim_{n\to\infty} \Lambda_n^Q = I^Q$, with $I^Q$ defined in (2.5). We will
reprove this fact below in Proposition 2.20.

For $Q(x) = x^2/2$, the formula [6, A.6.11] with the appropriate scaling gives
the formula for computing $\Lambda_n = \Lambda_n^Q$ as

(2.19)    $\Lambda_n = \frac{1}{2} + \frac{1}{2}\log(n-1) - \frac{1}{n(n-1)}\sum_{k=1}^n k\log k = \frac{1}{2} - \frac{\log(n-1)}{n-1} - \frac{1}{n}\sum_{k=1}^n \frac{k}{n-1}\log\Big(\frac{k}{n-1}\Big).$
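For $Q(x) = x^2/2$ the minimum energy can be cross-checked numerically. The
sketch below assumes a classical fact going back to Stieltjes, which is not
proved in this note: the minimizers of the discrete energy are the zeros of the
(physicists') Hermite polynomial $H_n$, here rescaled by $\sqrt{2/(n-1)}$ to
match the normalization of the energy in Theorem 2.16. It compares the energy
at these points with the closed form
$\frac{1}{2} + \frac{1}{2}\log(n-1) - \frac{1}{n(n-1)}\sum_{k=1}^n k\log k$,
which follows from the discriminant of $H_n$. All function names are
illustrative.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss


def energy(x: np.ndarray) -> float:
    """Discrete energy for Q(x) = x^2 / 2 with the 2 / (n (n - 1)) normalization."""
    n = len(x)
    i, j = np.triu_indices(n, k=1)  # all index pairs with i < j
    log_sum = np.sum(np.log(np.abs(x[i] - x[j])))
    return float(np.mean(x ** 2) / 2.0 - 2.0 * log_sum / (n * (n - 1)))


def fekete_points(n: int) -> np.ndarray:
    """Zeros of H_n rescaled by sqrt(2 / (n - 1)); assumed minimizers, n >= 2."""
    zeros, _weights = hermgauss(n)  # Gauss-Hermite nodes are the zeros of H_n
    return np.sqrt(2.0 / (n - 1)) * zeros


def min_energy_closed_form(n: int) -> float:
    """Closed form for the minimum energy when Q(x) = x^2 / 2, n >= 2."""
    k = np.arange(1, n + 1)
    return float(0.5 + 0.5 * np.log(n - 1) - np.sum(k * np.log(k)) / (n * (n - 1)))


if __name__ == "__main__":
    for n in (5, 20, 80):
        print(n, energy(fekete_points(n)), min_energy_closed_form(n))
```

As $n$ grows both columns should approach $3/4$, the energy of the semicircular
law, and any other configuration should satisfy (2.17) with $\rho = 1/2$
against these points.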

The next statement is a similar result to Theorems 2.4 and 2.16.

Proposition 2.20. Assume $Q : \mathbb{R} \to \mathbb{R}$ is a function so that
$Q(x) - \rho x^2$ is convex for a certain $\rho > 0$. Then for any
$\nu \in \mathcal{P}(\mathbb{R})$ and $y \in \mathbb{R}^n$ a Fekete point for
$E_n^Q$, we have

(2.21)    $\rho\,(W_2(\nu,\mu(y)))^2 \le E^Q(\nu) - \Lambda_n^Q.$

Furthermore, if $\mu_Q$ is the minimizing measure of $E^Q$, and
$y_n \in \mathbb{R}^n$ is a Fekete point for $E_n^Q$, then

(2.22)    $\lim_{n\to\infty} \Lambda_n^Q = I^Q$ and $\lim_{n\to\infty} W_2(\mu_Q,\mu(y_n)) = 0,$

hence $\mu(y_n) \to \mu_Q$ weakly as $n \to \infty$.

Proof. In the first place there is nothing to prove if $E^Q(\nu) = \infty$.
Therefore we assume that $E^Q(\nu) < \infty$. Integrating (2.17) with respect
to $\nu(dx_1)\nu(dx_2)\cdots\nu(dx_n)$, one gets that

$\rho\int (W_2(\mu(x),\mu(y)))^2\,\nu(dx_1)\nu(dx_2)\cdots\nu(dx_n) \le E^Q(\nu) - \Lambda_n^Q.$

We finish the proof of (2.21) by showing that

(*)    $\int (W_2(\mu(x),\mu(y)))^2\,\nu(dx_1)\nu(dx_2)\cdots\nu(dx_n) = (W_2(\nu,\mu(y)))^2.$

To do this, we proceed by induction. For $n = 1$, this statement becomes

$\int (W_2(\delta_x,\delta_y))^2\,\nu(dx) = (W_2(\nu,\delta_y))^2,$

which, cf. (2.3), is equivalent to the following (here $F_\nu$ is the
cumulative distribution function of $\nu$):

$\int |x-y|^2\,\nu(dx) = \int_0^1 |y - F_\nu^{-1}(t)|^2\,dt.$

This can be checked by changing the variable in the second integral.

Assume (*) is true for $n-1$, $n \ge 2$. A simple application of (2.3) gives
that $(W_2(\mu(x),\mu(y)))^2 = \frac{1}{n}\sum_{k=1}^n |x_{\sigma(k)} - y_{\tau(k)}|^2$,
where $\sigma$ and $\tau$ are permutations of $\{1, 2, \dots, n\}$ so that
$x_{\sigma(1)} \le x_{\sigma(2)} \le \dots \le x_{\sigma(n)}$ and
$y_{\tau(1)} \le y_{\tau(2)} \le \dots \le y_{\tau(n)}$. If we denote by $x^i$
the vector $x$ with the $i$th component removed and similarly for $y^i$, one
deduces

(#)    $(W_2(\mu(x),\mu(y)))^2 = \frac{1}{n}\sum_{i=1}^n (W_2(\mu(x^i),\mu(y^i)))^2.$

On the other hand,

$(W_2(\nu,\mu(y)))^2 = \sum_{k=1}^n \int_{(k-1)/n}^{k/n} |y_{\tau(k)} - F_\nu^{-1}(t)|^2\,dt,$

which can be used to argue that

(##)    $(W_2(\nu,\mu(y)))^2 = \frac{1}{n}\sum_{i=1}^n (W_2(\nu,\mu(y^i)))^2.$

Putting together (#) and (##) and the induction hypothesis one finishes the
proof of (*).

To prove (2.22), we first point out that (2.21) applied to $\nu = \mu_Q$ yields
that $I^Q \ge \Lambda_n^Q$ for any $n \ge 1$. In particular this means that
$\Lambda_n^Q$ is bounded. Since
$-\log|x-y| \ge -\frac{\rho}{4}(x^2+y^2) + c$ for a certain constant $c$, we
get that $\Lambda_n^Q \ge \frac{\rho}{2}\int x^2\,\mu(y_n)(dx) - C$, where $C$
is a constant. This implies that the sequence
$\{\int x^2\,\mu(y_n)(dx)\}_{n\ge 1}$ is bounded, whose consequence is that the
sequence of measures $\mu(y_n)$ is tight, therefore there is a subsequence
$\mu(y_{n_k})$ converging weakly to a measure $\nu$. Now, for any $L > 0$, we
have

$\iint \min\{(Q(x)+Q(y))/2 - \log|x-y|,\, L\}\,\mu(y_{n_k})(dx)\,\mu(y_{n_k})(dy) \le \Lambda_{n_k}^Q + L/n_k,$

and this demonstrates that for any $L > 0$,

$\iint \min\{(Q(x)+Q(y))/2 - \log|x-y|,\, L\}\,\nu(dx)\,\nu(dy) \le I^Q,$

and, after passing $L \to \infty$, this yields

$E^Q(\nu) \le I^Q.$

This together with (2.18) and the uniqueness of $\mu_Q$ from Theorem 2.4 ends
the proof of $\lim_{n\to\infty} \Lambda_n^Q = I^Q$. The rest follows. □
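3. Discrete Energy for $\beta$-Ensembles

In this section we deal with $\beta$-ensembles, which are studied in [4]. These
are tridiagonal matrices with independent entries of the form

$A_n = \frac{1}{\sqrt{n\beta}}
\begin{pmatrix}
N(0,2) & \chi_{(n-1)\beta} & & & \\
\chi_{(n-1)\beta} & N(0,2) & \chi_{(n-2)\beta} & & \\
 & \ddots & \ddots & \ddots & \\
 & & \chi_{2\beta} & N(0,2) & \chi_{\beta} \\
 & & & \chi_{\beta} & N(0,2)
\end{pmatrix}.$

Here $N(0,2)$ stands for a normal with mean $0$ and variance $2$, while
$\chi_\gamma$ is the $\chi$-distribution with parameter $\gamma$. The joint
distribution of the eigenvalues is

$\frac{1}{Z_{\beta,n}} \prod_{1\le i<j\le n} |x_i - x_j|^\beta \exp\Big(-\frac{\beta n}{4}\sum_{i=1}^n x_i^2\Big),$

where $Z_{\beta,n}$ is a normalization constant.

Set $\mu_n = \frac{1}{n}\sum_{k=1}^n \delta_{\lambda_{k,n}}$, the empirical
distribution of the eigenvalues $\{\lambda_{k,n}\}_{k=1}^n$ of $A_n$.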

Theorem 3.1. Set
$E_n = \frac{1}{2n}\sum_{k=1}^n \lambda_k^2 - \frac{2}{n(n-1)}\sum_{1\le j<k\le n} \log|\lambda_j - \lambda_k|$,
the energy of the eigenvalues $\{\lambda_k\}_{k=1}^n$ of $A_n$. If $\Lambda_n$
is the quantity defined in (2.19), then almost surely,

(3.2)    $\lim_{n\to\infty} n(E_n - \Lambda_n) = \psi(1+\beta/2) - \log(\beta/2),$

where $\psi(x) = \frac{d}{dx}\log\Gamma(x)$ and $\Gamma$ is the Gamma function.
In addition, we have that

(3.3)    $n^{1/2}\big(n(E_n - \Lambda_n) - (\psi(1+\beta/2) - \log(\beta/2))\big) \to N(0, \psi'(1+\beta/2))$ as $n \to \infty$,

where the convergence is in the distribution sense.

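The large deviations of $n(E_n - \Lambda_n)$ are governed by the rate function

$R^*(t) = \sup\{tz - R(z) : z \in \mathbb{R}\},$

where

$R(z) = z + (\beta/2 - z)\log(\beta/2 - z) - \log\frac{\Gamma(1+\beta/2-z)}{\Gamma(1+\beta/2)} - (\beta/2)\log(\beta/2)$ for $z < \beta/2$,
$R(z) = \infty$ for $z \ge \beta/2$.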


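Proof. The proof is based on a version of Selberg's formula and elementary
approximations involving the Gamma function.

First, we have

$\mathbb{E}[\exp(zE_n)] = \dfrac{\int_{\mathbb{R}^n} \prod_{1\le i<j\le n} |x_i - x_j|^{\beta - \frac{2z}{n(n-1)}} \exp\big(-\big(\frac{\beta n}{4} - \frac{z}{2n}\big)\sum_{i=1}^n x_i^2\big)\,dx}{\int_{\mathbb{R}^n} \prod_{1\le i<j\le n} |x_i - x_j|^{\beta} \exp\big(-\frac{\beta n}{4}\sum_{i=1}^n x_i^2\big)\,dx}$

and then, as a consequence of Selberg's formula [6, equation 17.6.7], we get
for complex $z$ that

$\mathbb{E}[e^{zE_n}] = \dfrac{(n\beta/2 - z/n)^{-\frac{n}{2}[(n-1)\beta/2 - z/n + 1]} \prod_{j=1}^n \frac{\Gamma(1 + j\beta/2 - jz/(n(n-1)))}{\Gamma(1 + \beta/2 - z/(n(n-1)))}}{(n\beta/2)^{-\frac{n}{2}[(n-1)\beta/2 + 1]} \prod_{j=1}^n \frac{\Gamma(1 + j\beta/2)}{\Gamma(1 + \beta/2)}}$ for $\Re(z) < \beta/2$,
$\mathbb{E}[e^{zE_n}] = \infty$ for $\Re(z) \ge \beta/2$.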
We need Stirling's formula for the approximation of the Gamma function in the
following form:

$\log\Gamma(t+1) = (t + 1/2)\log t - t + \log(2\pi)/2 + O\Big(\frac{1}{1+t}\Big)$ for $t > 0$.



Using this and the above formula for $\mathbb{E}[\exp(zE_n)]$ and (2.19), after
some arrangements one gets

(3.4)    [displayed expansion of $\log\mathbb{E}\big[e^{z(E_n - \Lambda_n)}\big]$ in terms of $n$, $\beta$ and $z$; the formula is illegible in the source]

From this, replacing $z$ by $nz$, one immediately obtains that for any $z$,

$\log\big(\mathbb{E}[\exp(zn(E_n - \Lambda_n))]\big) \to z\,\frac{\Gamma'(1+\beta/2)}{\Gamma(1+\beta/2)} - z\log(\beta/2) = z(\psi(1+\beta/2) - \log(\beta/2))$ as $n \to \infty$.

Applying (3.4) with $z$ replaced by $n^{3/2}z$, one can prove that for any
complex $z$,

$\log\Big(\mathbb{E}\big[\exp\big[zn^{1/2}\big(n(E_n - \Lambda_n) - (\psi(1+\beta/2) - \log(\beta/2))\big)\big]\big]\Big) \to z^2\psi'(1+\beta/2)/2$ as $n \to \infty$,

whose consequence is (3.3). This, applied for $z = \pm 1$ together with
Chebyshev's inequality, yields

$P\big(|n(E_n - \Lambda_n) - (\psi(1+\beta/2) - \log(\beta/2))| > \epsilon\big) \le Ce^{-\epsilon n^{1/2}}$

for a certain constant $C > 0$. This and an application of the Borel-Cantelli
Lemma prove (3.2). Again applying (3.4) with $n^2 z$ in place of $z$, one can
show that

$\frac{1}{n}\log\big(\mathbb{E}[\exp(zn^2(E_n - \Lambda_n))]\big) \to R(z)$

for any $z \in \mathbb{R}$. As a consequence of standard large deviations
results (see for example Section 2.2 in [3]) we conclude the proof of the last
part of the theorem. □

Corollary 3.5. $E_n$ converges almost surely to $3/4$, the energy of the
semicircular law on $[-2,2]$. This implies that the spectral distribution
$\mu_n$ of $A_n$ converges almost surely to the semicircular law on $[-2,2]$.

Proof. The convergence of $E_n$ to $3/4$ follows from (3.2) and the fact that
the second expression in (2.19) converges to
$1/2 - \int_0^1 x\log(x)\,dx = 3/4$. Alternatively, we can use Proposition 2.20
for the convergence of $\Lambda_n$ to the free energy of the semicircular law.
For the convergence of the spectral distribution, we use Theorem 2.16 and
Proposition 2.20 with $Q(x) = x^2/2$ plus the triangle inequality to justify
that almost surely

$W_2(\mu_n,\sigma) \le W_2(\mu_n,\mu(y_n)) + W_2(\mu(y_n),\sigma) \le \sqrt{2(E_n - \Lambda_n)} + \sqrt{2(3/4 - \Lambda_n)} \to 0$ as $n \to \infty$.

□